Name: Chirag Shah & Kunal Tolani
Student ID: 19200072 & 19200153
import keras
from keras.models import Sequential
from keras.layers import Dense, Activation, Dropout, Conv2D, MaxPooling2D, Flatten
from keras.utils import np_utils
from keras import backend as K
from keras.utils.np_utils import to_categorical
from keras.utils.vis_utils import model_to_dot
from keras.optimizers import RMSprop, adam
from keras.callbacks import ModelCheckpoint
from keras.preprocessing.image import ImageDataGenerator
from keras.wrappers.scikit_learn import KerasClassifier
import sklearn
from sklearn.tree import export_graphviz
from sklearn import metrics
from sklearn.model_selection import train_test_split
from sklearn import preprocessing
from sklearn.utils import shuffle
from sklearn.linear_model import LogisticRegression
from IPython.display import SVG
import csv
import os
import cv2
import random
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
import scipy as sp
import PIL.Image
The data was uploaded to Kaggle and imported into Google Colab from there. We did this because data stored on Colab was lost whenever the network connection dropped. Another option was to upload the data to Google Drive and read it from there, but we observed that reads from Google Drive were very slow.
To load the dataset successfully, the kaggle.json config file must be uploaded with the files.upload() command. This file is included in the zip file of the solution.
Warning: This will download the entire dataset from Kaggle.
! pip install -q kaggle
#from google.colab import files
#files.upload()
! mkdir -p ~/.kaggle
! cp kaggle.json ~/.kaggle/
! chmod 600 ~/.kaggle/kaggle.json
! kaggle datasets download -d kunal4/pneumonia
! unzip pneumonia.zip -d pneumonia
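The credential setup performed by the shell commands above can also be sketched in plain Python. This is only an illustration: the helper name `install_kaggle_token` and the dummy token contents are assumptions made for the example, and the demonstration writes into a temporary directory rather than the real home directory.

```python
import os
import shutil
import stat
import tempfile
from pathlib import Path


def install_kaggle_token(token_path, home_dir):
    """Copy kaggle.json into <home_dir>/.kaggle and restrict it to the owner."""
    kaggle_dir = Path(home_dir) / ".kaggle"
    kaggle_dir.mkdir(parents=True, exist_ok=True)  # safe to re-run
    target = kaggle_dir / "kaggle.json"
    shutil.copyfile(token_path, target)
    os.chmod(target, stat.S_IRUSR | stat.S_IWUSR)  # equivalent to chmod 600
    return target


# Demonstration in a throwaway directory with a dummy token (not real credentials).
with tempfile.TemporaryDirectory() as tmp:
    dummy = Path(tmp) / "kaggle.json"
    dummy.write_text('{"username": "example", "key": "example"}')
    installed = install_kaggle_token(dummy, Path(tmp) / "home")
    installed_name = installed.name
    installed_mode = stat.S_IMODE(installed.stat().st_mode)
    print(installed_name, oct(installed_mode))
```

On Colab the same helper would be called with the uploaded kaggle.json and the real home directory.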
# location of data
dataset_name = 'pneumonia/chest_xray'
# train and test data directories
train_data_dir = dataset_name + '/train/'
test_data_dir = dataset_name + '/test/'
# Data sample rate - using the entire data
sample_rate = 1.0
# Image dimensions - 162 x 128
img_width, img_height = 162, 128
# Different ordering of channels depending on the backend's image data format.
if K.image_data_format() == 'channels_first':
    input_shape = (3, img_width, img_height)
else:
    input_shape = (img_width, img_height, 3)
training_class_folders = [i for i in os.listdir(train_data_dir) if not i.startswith('.')] # use this for full dataset
num_classes = len(training_class_folders)
# Initialise arrays for data storage (np.float and np.str were removed from recent NumPy releases)
X_train = np.ndarray((0, input_shape[0], input_shape[1], input_shape[2]), dtype=float)
y_train = np.ndarray(0, dtype=str)
# Loop through the class folders
for image_cls in training_class_folders:
    print('Processing class {}'.format(image_cls))
    image_class_folder = train_data_dir + image_cls + "/"
    # Generate filenames from the data folder and do sampling
    image_filenames = [image_class_folder + f for f in os.listdir(image_class_folder) if not f.startswith('.')] # use this for full dataset
    image_filenames = random.sample(image_filenames, int(len(image_filenames)*sample_rate))
    # Create a data array for image data
    count = len(image_filenames)
    X_train_part = np.ndarray((count, input_shape[0], input_shape[1], input_shape[2]), dtype=float)
    # Iterate through the filenames and for each one load the image, resize and normalise
    for i, image_file in enumerate(image_filenames):
        # Load the image and resize it; cv2.resize takes (columns, rows), so the result has 162 rows and 128 columns
        image = cv2.imread(image_file, cv2.IMREAD_COLOR)
        image = cv2.resize(image, (img_height, img_width), interpolation=cv2.INTER_CUBIC)
        image = image[:,:,[2,1,0]] # OpenCV and matplotlib use different channel orderings so fix this
        # If the channel order of the network does not match the OpenCV format, swap it
        if K.image_data_format() == 'channels_first':
            image = np.swapaxes(np.swapaxes(image, 1, 2), 0, 1)
        # Add image data to data array and normalise
        X_train_part[i] = image
        X_train_part[i] = X_train_part[i]/255
        # Add label to label array
        y_train = np.append(y_train, image_cls)
        if i%100 == 0: print('Processed {} of {} for class {} '.format(i, count, image_cls))
    print('Processed {} of {} for class {} '.format(i + 1, count, image_cls))
    # Append the part to the overall data array
    X_train = np.append(X_train, X_train_part, axis=0)
print("Data shape: {}".format(X_train.shape))
testing_class_folders = [i for i in os.listdir(test_data_dir) if not i.startswith('.')] # use this for full dataset
num_classes = len(testing_class_folders)
# Initialise arrays for data storage (np.float and np.str were removed from recent NumPy releases)
X_test_data = np.ndarray((0, input_shape[0], input_shape[1], input_shape[2]), dtype=float)
y_test_data = np.ndarray(0, dtype=str)
# Loop through the class folders
for image_cls in testing_class_folders:
    print('Processing class {}'.format(image_cls))
    image_class_folder = test_data_dir + image_cls + "/"
    # Generate filenames from the data folder and do sampling
    image_filenames = [image_class_folder + f for f in os.listdir(image_class_folder) if not f.startswith('.')] # use this for full dataset
    image_filenames = random.sample(image_filenames, int(len(image_filenames)*sample_rate))
    # Create a data array for image data
    count = len(image_filenames)
    X_test_part = np.ndarray((count, input_shape[0], input_shape[1], input_shape[2]), dtype=float)
    # Iterate through the filenames and for each one load the image, resize and normalise
    for i, image_file in enumerate(image_filenames):
        # Load the image and resize it; cv2.resize takes (columns, rows), so the result has 162 rows and 128 columns
        image = cv2.imread(image_file, cv2.IMREAD_COLOR)
        image = cv2.resize(image, (img_height, img_width), interpolation=cv2.INTER_CUBIC)
        image = image[:,:,[2,1,0]] # OpenCV and matplotlib use different channel orderings so fix this
        # If the channel order of the network does not match the OpenCV format, swap it
        if K.image_data_format() == 'channels_first':
            image = np.swapaxes(np.swapaxes(image, 1, 2), 0, 1)
        # Add image data to data array and normalise
        X_test_part[i] = image
        X_test_part[i] = X_test_part[i]/255
        # Add label to label array
        y_test_data = np.append(y_test_data, image_cls)
        if i%100 == 0: print('Processed {} of {} for class {} '.format(i, count, image_cls))
    print('Processed {} of {} for class {} '.format(i + 1, count, image_cls))
    # Append the part to the overall data array
    X_test_data = np.append(X_test_data, X_test_part, axis=0)
print("Data shape: {}".format(X_test_data.shape))
# Perform the split into training and validation sets (the test set was loaded separately)
X_train_data, X_val_data, y_train_data, y_val_data = train_test_split(X_train, y_train, random_state=0, test_size = 0.3, train_size = 0.7, shuffle=True)
# Convert class vectors to binary class matrices.
y_train_encoder = sklearn.preprocessing.LabelEncoder()
y_train_num = y_train_encoder.fit_transform(y_train_data)
y_train_wide = keras.utils.to_categorical(y_train_num, num_classes)
# Reuse the encoder fitted on the training labels so the class-to-index mapping cannot change
y_test_num = y_train_encoder.transform(y_test_data)
y_test_wide = keras.utils.to_categorical(y_test_num, num_classes)
y_val_num = y_train_encoder.transform(y_val_data)
y_val_wide = keras.utils.to_categorical(y_val_num, num_classes)
#Mapping labels to binary classes
classes_num_label = dict()
for idx, lbl in enumerate(y_train_encoder.classes_):
    classes_num_label[idx] = lbl
classes_num_label
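As a rough illustration of the label handling above, the following sketch mimics label encoding and one-hot conversion in plain Python. The helper names and the class names NORMAL and PNEUMONIA are assumptions made for the example; it is not the scikit-learn or Keras implementation. It also shows why a mapping learned from the training labels is worth reusing for other splits.

```python
def fit_label_map(labels):
    """Map each distinct label to an integer, in sorted order."""
    return {label: idx for idx, label in enumerate(sorted(set(labels)))}


def encode(labels, label_map):
    """Turn string labels into integer codes using an already-fitted map."""
    return [label_map[label] for label in labels]


def one_hot(codes, num_classes):
    """Turn integer codes into one-hot rows."""
    return [[1.0 if k == code else 0.0 for k in range(num_classes)] for code in codes]


train_labels = ["PNEUMONIA", "NORMAL", "PNEUMONIA"]
label_map = fit_label_map(train_labels)

# A hypothetical evaluation batch that happens to contain only one class.
eval_labels = ["PNEUMONIA", "PNEUMONIA"]
reused = encode(eval_labels, label_map)                  # mapping from training is kept
refit = encode(eval_labels, fit_label_map(eval_labels))  # mapping silently changes

print(label_map)
print(reused, refit)
print(one_hot(reused, num_classes=len(label_map)))
```

When every split contains all classes, refitting gives the same mapping, but reusing the fitted map removes the dependence on that coincidence.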
pltsize=4
row_images = 5
col_images = 5
plt.figure(figsize=(col_images*pltsize, row_images*pltsize))
for i in range(row_images * col_images):
    # randrange excludes the upper bound, so the index is always valid
    i_rand = random.randrange(X_train.shape[0])
    plt.subplot(row_images,col_images,i+1)
    plt.axis('off')
    plt.imshow(PIL.Image.fromarray((X_train[i_rand] * 255).astype(np.uint8)))
    plt.title((str(i_rand) + " " + y_train[i_rand]))
#Flatten data for use in Logistic Regression
X_train_flat = X_train_data.reshape(X_train_data.shape[0], X_train_data.shape[1]*X_train_data.shape[2]*X_train_data.shape[3])
X_test_flat = X_test_data.reshape(X_test_data.shape[0], X_test_data.shape[1]*X_test_data.shape[2]*X_test_data.shape[3])
X_val_flat = X_val_data.reshape(X_val_data.shape[0], X_val_data.shape[1]*X_val_data.shape[2]*X_val_data.shape[3])
log_reg = LogisticRegression(max_iter=10000)
log_reg.fit(X_train_flat,y_train_data)
train_acc = {}
y_pred_lr = log_reg.predict(X_train_flat)
print(metrics.classification_report(y_train_data, y_pred_lr))
print("Confusion matrix")
print(metrics.confusion_matrix(y_train_data, y_pred_lr))
train_acc['Logistic Regression'] = metrics.accuracy_score(y_train_data, y_pred_lr)
test_acc = {}
y_pred_lr_test = log_reg.predict(X_test_flat)
print(metrics.classification_report(y_test_data, y_pred_lr_test))
print("Confusion matrix")
print(metrics.confusion_matrix(y_test_data, y_pred_lr_test))
test_acc['Logistic Regression'] = metrics.accuracy_score(y_test_data, y_pred_lr_test)
As a simple base model, we used Logistic Regression with the default parameters, changing only the maximum number of iterations to a large value so that the model could converge.
The results on the test set were acceptable, in particular a recall of 0.98 for pneumonia patients. There is a large number of false positives, but this is expected from a base classifier like Logistic Regression.
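To make the relationship between recall and false positives concrete, here is a small sketch that derives the usual metrics from the four cells of a binary confusion matrix. The counts are invented for illustration (chosen only so that recall comes out at 0.98); they are not the notebook's actual results. scikit-learn's `confusion_matrix` lays the cells out as `[[tn, fp], [fn, tp]]` when the positive class is the second label.

```python
def binary_metrics(tn, fp, fn, tp):
    """Derive common metrics from the cells of a 2x2 confusion matrix."""
    return {
        "precision": tp / (tp + fp),    # how many predicted positives are real
        "recall": tp / (tp + fn),       # how many real positives are found
        "specificity": tn / (tn + fp),  # how many real negatives are found
        "accuracy": (tp + tn) / (tn + fp + fn + tp),
    }


# Hypothetical counts: 100 normal and 200 pneumonia images.
example = binary_metrics(tn=60, fp=40, fn=4, tp=196)
for name, value in example.items():
    print("{}: {:.3f}".format(name, value))
```

With these made-up numbers recall is high while specificity is low, which is the pattern described above: almost every pneumonia case is caught, at the cost of many normal images being flagged.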
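from sklearn.utils import class_weight
# Keras documents class_weight as a dict keyed by integer class index. np.unique sorts the labels
# the same way LabelEncoder does, so position k here corresponds to class index k.
class_weights = dict(enumerate(class_weight.compute_class_weight('balanced', classes=np.unique(y_train), y=y_train)))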
model = Sequential()
model.add(Conv2D(32, (5, 5), input_shape=(162,128,3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (5, 5)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(256))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(84, activation='relu'))
model.add(Dense(num_classes))
model.add(Activation('sigmoid'))
optimizer = keras.optimizers.Adam(lr=0.0001)
model.compile(loss='categorical_crossentropy',
optimizer=optimizer,
metrics=['accuracy'])
model.summary()
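The feature-map sizes reported by `model.summary()` for the network above can be checked by hand. The sketch below assumes the Keras defaults used in this model (convolutions with 'valid' padding and stride 1, max pooling with stride equal to the pool size); the helper names are made up for the example.

```python
def conv_valid(size, kernel):
    """Output length of a stride-1 convolution without padding."""
    return size - kernel + 1


def pool(size, window):
    """Output length of non-overlapping max pooling (remainder is dropped)."""
    return size // window


def feature_map_shapes(height, width, kernel=5, filters=(32, 64), window=2):
    """Trace (layer, height, width, channels) through conv + pool stages."""
    shapes = []
    for n_filters in filters:
        height, width = conv_valid(height, kernel), conv_valid(width, kernel)
        shapes.append(("conv", height, width, n_filters))
        height, width = pool(height, window), pool(width, window)
        shapes.append(("pool", height, width, n_filters))
    return shapes


shapes = feature_map_shapes(162, 128)
for shape in shapes:
    print(shape)

_, last_h, last_w, last_c = shapes[-1]
flattened = last_h * last_w * last_c
dense_params = flattened * 256 + 256  # weights plus biases of Dense(256)
print(flattened, dense_params)
```

Most of the parameters sit in the first dense layer after flattening, which is one reason dropout is applied there.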
batch_size = 128
epochs = 50
# Set up the callback to save the best model based on validation data
best_weights_filepath = './simple_cnn_unbalanced.hdf5'
mcp = ModelCheckpoint(best_weights_filepath, monitor="val_loss",
save_best_only=True, save_weights_only=False)
history = model.fit(X_train_data, y_train_wide,
batch_size=batch_size,
epochs=epochs,
verbose = 1,
validation_data=(X_val_data, y_val_wide),
shuffle=True,
callbacks=[mcp])
#reload best weights
model.load_weights(best_weights_filepath)
loss = history.history['loss']
val_loss = history.history['val_loss']
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.plot(loss, 'blue', label='Training Loss')
plt.plot(val_loss, 'green', label='Validation Loss')
plt.xticks(range(0,epochs)[0::2])
plt.legend()
plt.show()
# Make a set of predictions for the training data
pred_no_weights = np.argmax(model.predict(X_train_data), axis=1)  # predict_classes is unavailable in recent Keras releases
# Print performance details
print(metrics.classification_report(y_train_num, pred_no_weights))
print("Confusion matrix")
print(metrics.confusion_matrix(y_train_num, pred_no_weights))
train_acc['LeNet5 Unbalanced'] = metrics.accuracy_score(y_train_num, pred_no_weights)
# Make a set of predictions for the test data
pred_no_weights_test = np.argmax(model.predict(X_test_data), axis=1)
# Print performance details
print(metrics.classification_report(y_test_num, pred_no_weights_test))
print("Confusion matrix")
print(metrics.confusion_matrix(y_test_num, pred_no_weights_test))
test_acc['LeNet5 Unbalanced'] = metrics.accuracy_score(y_test_num, pred_no_weights_test)
pltsize=4
row_images = 4
col_images = 4
maxtoshow = row_images * col_images
predictions = pred_no_weights_test.reshape(-1)
corrects = predictions == y_test_num
ii = 0
plt.figure(figsize=(col_images*pltsize, row_images*pltsize))
for i in range(X_test_data.shape[0]):
    if ii>=maxtoshow:
        break
    if corrects[i]:
        plt.subplot(row_images,col_images, ii+1)
        plt.axis('off')
        plt.imshow(PIL.Image.fromarray((X_test_data[i] * 255).astype(np.uint8)))
        plt.title("{} for {}".format(classes_num_label[predictions[i]], y_test_data[i]))
        ii = ii + 1
pltsize=4
row_images = 4
col_images = 4
maxtoshow = row_images * col_images
predictions = pred_no_weights_test.reshape(-1)
corrects = predictions == y_test_num
ii = 0
plt.figure(figsize=(col_images*pltsize, row_images*pltsize))
for i in range(X_test_data.shape[0]):
    if ii>=maxtoshow:
        break
    if not corrects[i]:
        plt.subplot(row_images,col_images, ii+1)
        plt.axis('off')
        plt.imshow(PIL.Image.fromarray((X_test_data[i] * 255).astype(np.uint8)))
        plt.title("{} for {}".format(classes_num_label[predictions[i]], y_test_data[i]))
        ii = ii + 1
A LeNet-5-style architecture (two convolution and pooling stages followed by dense layers) was used to build the CNN classifier. The results are better than those of the base model (Logistic Regression). There is still a large number of false positives (pneumonia predicted for normal patients) because the dataset is heavily skewed towards pneumonia patients.
The number of true negatives (normal predicted for normal) is higher than for the base model, which is good.
For regularisation, we introduced a dropout layer to avoid overfitting.
The dataset is highly imbalanced, so we try to address this next using class weights.
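For reference, the 'balanced' heuristic in scikit-learn weights each class by n_samples / (n_classes * class_count), so the rarer class gets the larger weight. The sketch below reproduces that formula in plain Python; the 25/75 split and the class names are invented for illustration and do not reflect the real dataset counts.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * count of that class)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count)
            for label, count in sorted(counts.items())}


# Hypothetical imbalanced label list: 25 normal, 75 pneumonia.
labels = ["NORMAL"] * 25 + ["PNEUMONIA"] * 75
weights = balanced_class_weights(labels)

# Keras expects a dict keyed by integer class index; sorted label order
# matches the order a label encoder would assign.
index_weights = {idx: w for idx, w in enumerate(weights.values())}
print(weights)
print(index_weights)
```

With these weights, each misclassified normal image contributes three times as much to the loss as a misclassified pneumonia image.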
model = Sequential()
model.add(Conv2D(32, (5, 5), input_shape=(162,128,3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (5, 5)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(256))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(84, activation='relu'))
model.add(Dense(num_classes))
model.add(Activation('sigmoid'))
optimizer = keras.optimizers.Adam(lr=0.0001)
model.compile(loss='categorical_crossentropy',
optimizer=optimizer,
metrics=['accuracy'])
model.summary()
batch_size = 128
epochs = 50
# Set up the callback to save the best model based on validation data
best_weights_filepath = './simple_cnn_balanced.hdf5'
mcp = ModelCheckpoint(best_weights_filepath, monitor="val_loss",
save_best_only=True, save_weights_only=False)
history = model.fit(X_train_data, y_train_wide, class_weight=class_weights,
batch_size=batch_size,
epochs=epochs,
verbose = 1,
validation_data=(X_val_data, y_val_wide),
shuffle=True,
callbacks=[mcp])
#reload best weights
model.load_weights(best_weights_filepath)
loss = history.history['loss']
val_loss = history.history['val_loss']
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.plot(loss, 'blue', label='Training Loss')
plt.plot(val_loss, 'green', label='Validation Loss')
plt.xticks(range(0,epochs)[0::2])
plt.legend()
plt.show()
# Make a set of predictions for the training data
pred_weights = np.argmax(model.predict(X_train_data), axis=1)  # predict_classes is unavailable in recent Keras releases
# Print performance details
print(metrics.classification_report(y_train_num, pred_weights))
print("Confusion matrix")
print(metrics.confusion_matrix(y_train_num, pred_weights))
train_acc['LeNet5 Balanced'] = metrics.accuracy_score(y_train_num, pred_weights)
# Make a set of predictions for the test data
pred_weights_test = np.argmax(model.predict(X_test_data), axis=1)
# Print performance details
print(metrics.classification_report(y_test_num, pred_weights_test))
print("Confusion matrix")
print(metrics.confusion_matrix(y_test_num, pred_weights_test))
test_acc['LeNet5 Balanced'] = metrics.accuracy_score(y_test_num, pred_weights_test)
display(X_train_data.shape)
display(X_val_data.shape)
pltsize=4
row_images = 4
col_images = 4
maxtoshow = row_images * col_images
predictions = pred_weights_test.reshape(-1)
corrects = predictions == y_test_num
ii = 0
plt.figure(figsize=(col_images*pltsize, row_images*pltsize))
for i in range(X_test_data.shape[0]):
    if ii>=maxtoshow:
        break
    if corrects[i]:
        plt.subplot(row_images,col_images, ii+1)
        plt.axis('off')
        plt.imshow(PIL.Image.fromarray((X_test_data[i] * 255).astype(np.uint8)))
        plt.title("{} for {}".format(classes_num_label[predictions[i]], y_test_data[i]))
        ii = ii + 1
pltsize=4
row_images = 4
col_images = 4
maxtoshow = row_images * col_images
predictions = pred_weights_test.reshape(-1)
corrects = predictions == y_test_num
ii = 0
plt.figure(figsize=(col_images*pltsize, row_images*pltsize))
for i in range(X_test_data.shape[0]):
    if ii>=maxtoshow:
        break
    if not corrects[i]:
        plt.subplot(row_images,col_images, ii+1)
        plt.axis('off')
        plt.imshow(PIL.Image.fromarray((X_test_data[i] * 255).astype(np.uint8)))
        plt.title("{} for {}".format(classes_num_label[predictions[i]], y_test_data[i]))
        ii = ii + 1
The results show that introducing class weights brings little improvement. However, even if there had been a larger effect, it would have run counter to our needs, since it is more important to detect pneumonia than to detect its absence. Hence, we consider the bias in the dataset acceptable.
pltsize=4
row_images = 4
col_images = 4
# Create a transformed data generator
datagen = ImageDataGenerator(
featurewise_center=True,
width_shift_range=0.1,
height_shift_range=0.1, fill_mode='constant', cval = 0 ,zoom_range=[0.8,1.2])
# fit parameters from data
datagen.fit(X_train)
for idx in range(0, 4):
    # Plot the original image
    plt.figure(figsize=(col_images*pltsize, row_images*pltsize))
    plt.subplot(row_images,col_images,1)
    plt.axis('off')
    plt.imshow(PIL.Image.fromarray((X_train[idx] * 255).astype(np.uint8)))
    plt.title("Original")
    # Plot randomly transformed versions of the same image
    for i in range(row_images * col_images - 1):
        rand_trans = datagen.random_transform(X_train[idx])
        plt.subplot(row_images,col_images,i+2)
        plt.axis('off')
        plt.imshow(PIL.Image.fromarray((rand_trans * 255).astype(np.uint8)))
        plt.title(i)
    plt.show()
model = Sequential()
model.add(Conv2D(32, (3, 3), input_shape=(162,128,3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(256))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(84, activation='relu'))
model.add(Dense(num_classes))
model.add(Activation('sigmoid'))
optimizer = keras.optimizers.Adam(lr=0.001)
model.compile(loss='categorical_crossentropy',
optimizer=optimizer,
metrics=['accuracy'])
model.summary()
batch_size = 128
epochs = 20
# Set up the callback to save the best model based on validation data
best_weights_filepath = './data_augmentation.hdf5'
mcp = ModelCheckpoint(best_weights_filepath, monitor="val_loss",
save_best_only=True, save_weights_only=False)
# Create a data generator for the training data
datagen_train = ImageDataGenerator(
width_shift_range=0.2,
height_shift_range=0.2,zoom_range=[0.8,1.2], fill_mode='constant', cval = 0)
datagen_train.fit(X_train)
history = model.fit_generator(datagen_train.flow(X_train_data, y_train_wide, batch_size=batch_size),
validation_data = (X_val_data, y_val_wide),
epochs=epochs,
verbose = 1,
shuffle=True,
callbacks=[mcp])
#reload best weights
model.load_weights(best_weights_filepath)
loss = history.history['loss']
val_loss = history.history['val_loss']
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.plot(loss, 'blue', label='Training Loss')
plt.plot(val_loss, 'green', label='Validation Loss')
plt.xticks(range(0,epochs)[0::2])
plt.legend()
plt.show()
# Make a set of predictions for the training data
pred_aug = np.argmax(model.predict(X_train_data), axis=1)  # predict_classes is unavailable in recent Keras releases
# Print performance details
print(metrics.classification_report(y_train_num, pred_aug))
print("Confusion matrix")
print(metrics.confusion_matrix(y_train_num, pred_aug))
train_acc['CNN Augmentation'] = metrics.accuracy_score(y_train_num, pred_aug)
# Make a set of predictions for the test data
pred_aug_test = np.argmax(model.predict(X_test_data), axis=1)
# Print performance details
print(metrics.classification_report(y_test_num, pred_aug_test))
print("Confusion matrix")
print(metrics.confusion_matrix(y_test_num, pred_aug_test))
test_acc['CNN Augmentation'] = metrics.accuracy_score(y_test_num, pred_aug_test)
pltsize=4
row_images = 4
col_images = 4
maxtoshow = row_images * col_images
predictions = pred_aug_test.reshape(-1)
corrects = predictions == y_test_num
ii = 0
plt.figure(figsize=(col_images*pltsize, row_images*pltsize))
for i in range(X_test_data.shape[0]):
    if ii>=maxtoshow:
        break
    if corrects[i]:
        plt.subplot(row_images,col_images, ii+1)
        plt.axis('off')
        plt.imshow(PIL.Image.fromarray((X_test_data[i] * 255).astype(np.uint8)))
        plt.title("{} for {}".format(classes_num_label[predictions[i]], y_test_data[i]))
        ii = ii + 1
pltsize=4
row_images = 4
col_images = 4
maxtoshow = row_images * col_images
predictions = pred_aug_test.reshape(-1)
corrects = predictions == y_test_num
ii = 0
plt.figure(figsize=(col_images*pltsize, row_images*pltsize))
for i in range(X_test_data.shape[0]):
    if ii>=maxtoshow:
        break
    if not corrects[i]:
        plt.subplot(row_images,col_images, ii+1)
        plt.axis('off')
        plt.imshow(PIL.Image.fromarray((X_test_data[i] * 255).astype(np.uint8)))
        plt.title("{} for {}".format(classes_num_label[predictions[i]], y_test_data[i]))
        ii = ii + 1
For image augmentation, we used height_shift_range, width_shift_range, zoom_range and fill_mode as parameters. For shifted and zoomed images, we filled the exposed background with a constant value of 0 (black) using fill_mode='constant' and cval=0.
We also tried zca_whitening and brightness_range, but with them the model predicted only a single class.
The results on the test set after data augmentation are significantly better than those of the previous models. This might be because augmentation exposed the model to more varied training data. Note that this model also differs from the earlier CNNs in using 3x3 kernels, a learning rate of 0.001 and 20 epochs, so augmentation may not be the only factor.
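The shift-with-constant-fill behaviour described above can be sketched on a tiny grid. This is a simplified stand-in written for illustration, not what ImageDataGenerator does internally (which applies random, possibly fractional shifts and interpolation); the function name `shift_image` is hypothetical.

```python
def shift_image(image, dy, dx, fill=0):
    """Shift a 2-D image (list of rows) by dy rows and dx columns.

    Pixels moved outside the frame are dropped and the exposed area is
    filled with a constant value, like fill_mode='constant' with cval=fill.
    """
    height, width = len(image), len(image[0])
    shifted = [[fill] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            nr, nc = r + dy, c + dx
            if 0 <= nr < height and 0 <= nc < width:
                shifted[nr][nc] = image[r][c]
    return shifted


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]

# Move the content one row down and one column to the left.
moved = shift_image(image, dy=1, dx=-1)
for row in moved:
    print(row)
```

The zero-filled border is what appears as the black margin in the augmented X-ray previews.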
vgg16_model = keras.applications.VGG16(weights='imagenet', include_top=False, input_shape = X_train[0].shape)
display(vgg16_model.summary())
vgg16_last_layer = vgg16_model.output
# build a classifier model to put on top of the VGG16 model
x1 = Flatten()(vgg16_last_layer)
x2 = Dense(256, activation='relu')(x1)
x3 = Dropout(0.5)(x2)
final_layer = Dense(num_classes, activation = 'softmax')(x3)
# Assemble the full model out of both parts
full_model = keras.Model(vgg16_model.input, final_layer)
# set the first 17 layers (everything up to and including block5_conv2)
# to non-trainable (weights will not be updated)
for layer in full_model.layers[:17]:
    layer.trainable = False
# compile the model with an Adam optimizer
# and a learning rate of 0.001.
full_model.compile(loss='categorical_crossentropy',
optimizer=keras.optimizers.Adam(lr=0.001),
metrics=['accuracy'])
full_model.summary()
batch_size = 128
epochs = 20
# Set up the callback to save the best model based on validation data - notebook 2.2 needs to be run first.
best_weights_filepath = './vgg16_freeze17.hdf5'
mcp = ModelCheckpoint(best_weights_filepath, monitor="val_loss",
save_best_only=True, save_weights_only=False)
history = full_model.fit(X_train_data, y_train_wide,
batch_size=batch_size,
epochs=epochs,
verbose = 1,
validation_data = (X_val_data,y_val_wide),
shuffle=True,
callbacks=[mcp])
#reload best weights
full_model.load_weights(best_weights_filepath)
loss = history.history['loss']
val_loss = history.history['val_loss']
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.plot(loss, 'blue', label='Training Loss')
plt.plot(val_loss, 'green', label='Validation Loss')
plt.xticks(range(0,epochs)[0::2])
plt.legend()
plt.show()
pred_vgg = np.argmax(full_model.predict(X_train_data),axis=1)
# Print performance details
print(metrics.classification_report(y_train_num, pred_vgg))
print("Confusion matrix")
print(metrics.confusion_matrix(y_train_num, pred_vgg))
train_acc['CNN VGG-16'] = metrics.accuracy_score(y_train_num, pred_vgg)
pred_vgg_test = np.argmax(full_model.predict(X_test_data),axis=1)
# Print performance details
print(metrics.classification_report(y_test_num, pred_vgg_test))
print("Confusion matrix")
print(metrics.confusion_matrix(y_test_num, pred_vgg_test))
test_acc['CNN VGG-16'] = metrics.accuracy_score(y_test_num, pred_vgg_test)
pltsize=4
row_images = 4
col_images = 4
maxtoshow = row_images * col_images
predictions = pred_vgg_test.reshape(-1)
corrects = predictions == y_test_num
ii = 0
plt.figure(figsize=(col_images*pltsize, row_images*pltsize))
for i in range(X_test_data.shape[0]):
    if ii>=maxtoshow:
        break
    if corrects[i]:
        plt.subplot(row_images,col_images, ii+1)
        plt.axis('off')
        plt.imshow(PIL.Image.fromarray((X_test_data[i] * 255).astype(np.uint8)))
        plt.title("{} for {}".format(classes_num_label[predictions[i]], y_test_data[i]))
        ii = ii + 1
pltsize=4
row_images = 4
col_images = 4
maxtoshow = row_images * col_images
predictions = pred_vgg_test.reshape(-1)
corrects = predictions == y_test_num
ii = 0
plt.figure(figsize=(col_images*pltsize, row_images*pltsize))
for i in range(X_test_data.shape[0]):
    if ii>=maxtoshow:
        break
    if not corrects[i]:
        plt.subplot(row_images,col_images, ii+1)
        plt.axis('off')
        plt.imshow(PIL.Image.fromarray((X_test_data[i] * 255).astype(np.uint8)))
        plt.title("{} for {}".format(classes_num_label[predictions[i]], y_test_data[i]))
        ii = ii + 1
The pre-trained VGG-16 model was used and we froze the weights of the first 17 layers, leaving only the last convolution layer trainable. Transfer learning did not have any major impact compared to the simple LeNet-5-style architecture.
We tried unfreezing a couple more convolution layers, but that did not have much effect.
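To see which layers the slice `full_model.layers[:17]` covers, the layer order of VGG16 without its top can be written out as a plain list. The block names follow the Keras VGG16 naming; the input layer name and the four head layer names are placeholders chosen for this sketch, since Keras generates those names automatically.

```python
# Layer order of keras.applications.VGG16(include_top=False).
VGG16_NO_TOP_LAYERS = [
    "input",  # placeholder name for the input layer
    "block1_conv1", "block1_conv2", "block1_pool",
    "block2_conv1", "block2_conv2", "block2_pool",
    "block3_conv1", "block3_conv2", "block3_conv3", "block3_pool",
    "block4_conv1", "block4_conv2", "block4_conv3", "block4_pool",
    "block5_conv1", "block5_conv2", "block5_conv3", "block5_pool",
]

# Placeholder names for the classifier head added on top.
HEAD_LAYERS = ["flatten", "dense_256", "dropout", "dense_out"]

layers = VGG16_NO_TOP_LAYERS + HEAD_LAYERS
frozen = layers[:17]
trainable = layers[17:]
trainable_convs = [name for name in trainable if "conv" in name]

print("last frozen layer:", frozen[-1])
print("trainable layers:", trainable)
```

Unfreezing more of the base would mean lowering the slice bound, for example to 15 to also train block5_conv1 and block5_conv2.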
vgg16_model = keras.applications.VGG16(weights='imagenet', include_top=False, input_shape = X_train[0].shape)
display(vgg16_model.summary())
vgg16_last_layer = vgg16_model.output
# build a classifier model to put on top of the VGG16 model
x1 = Flatten()(vgg16_last_layer)
x2 = Dense(256, activation='relu')(x1)
x3 = Dropout(0.5)(x2)
final_layer = Dense(num_classes, activation = 'softmax')(x3)
# Assemble the full model out of both parts
full_model = keras.Model(vgg16_model.input, final_layer)
# set the first 17 layers (everything up to and including block5_conv2)
# to non-trainable (weights will not be updated)
for layer in full_model.layers[:17]:
    layer.trainable = False
# compile the model with an Adam optimizer
# and a learning rate of 0.001.
full_model.compile(loss='categorical_crossentropy',
optimizer=keras.optimizers.Adam(lr=0.001),
metrics=['accuracy'])
full_model.summary()
batch_size = 128
epochs = 20
# Set up the callback to save the best model based on validation data - notebook 2.2 needs to be run first.
best_weights_filepath = './vgg16_freeze17_augmentation.hdf5'
mcp = ModelCheckpoint(best_weights_filepath, monitor="val_loss",
save_best_only=True, save_weights_only=False)
# Create a data generator for the training data
datagen_train = ImageDataGenerator(
width_shift_range=0.2,
height_shift_range=0.2,zoom_range=[0.8,1.2], fill_mode='constant', cval = 0)
datagen_train.fit(X_train)
history = full_model.fit_generator(datagen_train.flow(X_train_data, y_train_wide, batch_size=batch_size),
epochs=epochs,
verbose = 1,
validation_data = (X_val_data,y_val_wide),
shuffle=True,
callbacks=[mcp])
#reload best weights
full_model.load_weights(best_weights_filepath)
loss = history.history['loss']
val_loss = history.history['val_loss']
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.plot(loss, 'blue', label='Training Loss')
plt.plot(val_loss, 'green', label='Validation Loss')
plt.xticks(range(0,epochs)[0::2])
plt.legend()
plt.show()
pred_vgg_aug = np.argmax(full_model.predict(X_train_data),axis=1)
# Print performance details
print(metrics.classification_report(y_train_num, pred_vgg_aug))
print("Confusion matrix")
print(metrics.confusion_matrix(y_train_num, pred_vgg_aug))
train_acc['CNN VGG-16 aug'] = metrics.accuracy_score(y_train_num, pred_vgg_aug)
pred_vgg_aug_test = np.argmax(full_model.predict(X_test_data),axis=1)
# Print performance details
print(metrics.classification_report(y_test_num, pred_vgg_aug_test))
print("Confusion matrix")
print(metrics.confusion_matrix(y_test_num, pred_vgg_aug_test))
test_acc['CNN VGG-16 aug'] = metrics.accuracy_score(y_test_num, pred_vgg_aug_test)
pltsize=4
row_images = 4
col_images = 4
maxtoshow = row_images * col_images
predictions = pred_vgg_aug_test.reshape(-1)
corrects = predictions == y_test_num
ii = 0
plt.figure(figsize=(col_images*pltsize, row_images*pltsize))
for i in range(X_test_data.shape[0]):
    if ii>=maxtoshow:
        break
    if corrects[i]:
        plt.subplot(row_images,col_images, ii+1)
        plt.axis('off')
        plt.imshow(PIL.Image.fromarray((X_test_data[i] * 255).astype(np.uint8)))
        plt.title("{} for {}".format(classes_num_label[predictions[i]], y_test_data[i]))
        ii = ii + 1
pltsize=4
row_images = 4
col_images = 4
maxtoshow = row_images * col_images
predictions = pred_vgg_aug_test.reshape(-1)
corrects = predictions == y_test_num
ii = 0
plt.figure(figsize=(col_images*pltsize, row_images*pltsize))
for i in range(X_test_data.shape[0]):
    if ii>=maxtoshow:
        break
    if not corrects[i]:
        plt.subplot(row_images,col_images, ii+1)
        plt.axis('off')
        plt.imshow(PIL.Image.fromarray((X_test_data[i] * 255).astype(np.uint8)))
        plt.title("{} for {}".format(classes_num_label[predictions[i]], y_test_data[i]))
        ii = ii + 1
Again, the results using data augmentation were better, likely because augmentation exposed the model to more varied training images.
fig, ax = plt.subplots()
traindf = pd.DataFrame(list(train_acc.items()), columns=['Model', 'Accuracy'])
display(traindf)
print("\n\n")
traindf.plot.bar(x = 'Model', y = 'Accuracy', ax = ax, legend = False, rot = 40)
plt.ylabel('Accuracy')
plt.ylim(bottom=0.6) # start the y-axis at 0.6 so differences between models are visible
testdf = pd.DataFrame(list(test_acc.items()), columns=['Model', 'Accuracy'])
display(testdf)
print()
fig, ax = plt.subplots()
testdf.plot(kind='bar', x = 'Model', y = 'Accuracy', ax = ax, legend = False, rot = 40)
plt.ylabel('Accuracy')
plt.ylim(bottom=0.6) # same lower limit as the training plot
The logistic regression model does not work as well as the neural network models on the image dataset. Neural networks learn features from the images themselves, whereas for logistic regression, feature engineering has to be performed separately.
Handling class imbalance using class weights did not have any significant impact on the results of the LeNet5 CNN architecture.
Since the dataset is biased towards pneumonia, balancing it would shift the model's bias towards the normal class. But detecting pneumonia correctly is more important than detecting its absence. Hence, for further processing we chose not to rectify the class imbalance.
Models incorporating data augmentation work better than their base models. (This can be inferred from the bar plot that summarizes the test accuracies of all the models.)
Using transfer learning with pre-trained models is faster than training a full model from scratch while giving similar results.
In terms of accuracies -
VGG16 with Augmentation > LeNet5 with Augmentation > VGG16 > LeNet5
Interesting behavior -
In the LeNet5 data augmentation model, the validation loss is consistently lower than the training loss throughout training. This could be for a number of reasons, some of which are discussed in the references below.
The validation loss is highly volatile in VGG16 with data augmentation, which seems strange. This could be due to a number of factors, such as regularisation or the learning rate.
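One explanation given in the Keras FAQ cited below is that the reported training loss is the average over all batches of an epoch, while the validation loss is computed with the model as it stands at the end of the epoch. The toy numbers below are invented to illustrate that effect alone; the sketch assumes, purely for simplicity, that validation data is exactly as hard as training data, and it ignores dropout and augmentation, which are active only during training and push the training loss up further.

```python
def epoch_report(batch_losses):
    """Return (mean loss over the epoch, loss at the end of the epoch)."""
    running_mean = sum(batch_losses) / len(batch_losses)
    end_of_epoch = batch_losses[-1]
    return running_mean, end_of_epoch


# Hypothetical epoch: the loss falls steadily from 1.0 to 0.5 over 11 batches.
batch_losses = [1.0 - 0.05 * step for step in range(11)]

reported_train, end_value = epoch_report(batch_losses)
print("reported training loss:", round(reported_train, 3))
print("loss at end of epoch:", round(end_value, 3))
```

Even with identical data difficulty, the averaged training figure sits above the end-of-epoch value whenever the model improves during the epoch.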
References -
Keras.io. 2020. FAQ - Keras Documentation. [online] Available at: https://keras.io/getting-started/faq/#why-is-the-training-loss-much-higher-than-the-testing-loss [Accessed 1 May 2020].
W., Mudau, T. and Beker, D., 2020. Why My Test Error Is Lower Then Train Error. [online] Artificial Intelligence Stack Exchange. Available at: https://ai.stackexchange.com/questions/4385/why-my-test-error-is-lower-then-train-error/4413#4413 [Accessed 1 May 2020].